A Random Forest Approach for Authorship Profiling

Authors

  • Alonso Palomino Garibay
  • Adolfo T. Camacho-González
  • Ricardo A. Fierro-Villaneda
  • Irazú Hernandez-Farias
  • Davide Buscaldi
  • Iván V. Meza
Abstract

In this paper we present our approach to extracting profile information from anonymized tweets for the author profiling task at PAN 2015 [10]. In particular, we explore the versatility of random forest classifiers for predicting gender and age group, and of random forest regressors for scoring important aspects of a user's personality. Furthermore, we propose a set of features tailored to this task, based on characteristics of tweets. Our approach relies on features previously proposed for sentiment analysis tasks.
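The abstract does not enumerate the features used. As a minimal sketch, assuming that simple surface counts (hashtags, mentions, URLs, emoticons) can stand in for "characteristics of tweets", per-user feature vectors might be built as follows. The function names and the feature set are hypothetical illustrations, not the authors' implementation.

```python
import re

# Hypothetical surface patterns; the abstract does not list the real feature set.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp]")


def tweet_features(tweet):
    """Count simple surface cues in a single tweet."""
    tokens = tweet.split()
    return {
        "n_tokens": len(tokens),
        "n_urls": len(URL_RE.findall(tweet)),
        "n_mentions": len(MENTION_RE.findall(tweet)),
        "n_hashtags": len(HASHTAG_RE.findall(tweet)),
        "n_emoticons": len(EMOTICON_RE.findall(tweet)),
        "n_exclamations": tweet.count("!"),
        # Fully upper-case tokens of length > 1 (a rough "shouting" cue).
        "n_upper_tokens": sum(1 for t in tokens if len(t) > 1 and t.isupper()),
    }


def user_features(tweets):
    """Average the per-tweet counts over all tweets of one user."""
    totals = {}
    for tweet in tweets:
        for name, value in tweet_features(tweet).items():
            totals[name] = totals.get(name, 0) + value
    n = max(len(tweets), 1)  # avoid division by zero for users without tweets
    return {name: value / n for name, value in totals.items()}
```

Under this assumption, one such vector per user could be passed to a random forest classifier for gender and age group, and to a random forest regressor for each personality score.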

Similar resources

Authorship Verification: An Approach based on Random Forest: Notebook for PAN at CLEF 2015

Authorship attribution, an important problem in many areas including information retrieval, computational linguistics, law, and journalism, has attracted increasing research interest in recent years. In the Author Identification task in PAN at CLEF 2015, the main focus was on cross-genre and cross-topic author verification tasks. We have used seve...

Random Forest with Increased Generalization: A Universal Background Approach for Authorship Verification

This article describes our approach for the Author Identification task introduced in PAN 2015. Given a set of documents written by the same author and a questioned document with an unknown author, the task is to decide whether the questioned document was written by the same author as the other documents or not. Our approach uses Random Forest and a feature-encoding scheme based on the Universal...

Extracting speaker-specific functional expressions from political speeches using random forests in order to investigate speakers’ individual political styles

In this study we extracted speaker-specific functional expressions from political speeches using random forests in order to investigate speakers’ individual political styles. Along with methodological development, stylistics has expanded its scope into new areas of application such as authorship profiling and sentiment analysis in addition to conventional areas such as authorship attribution an...

Bearing Capacity of Shallow Foundations on Cohesionless Soils: A Random Forest Based Approach

Determining the ultimate bearing capacity (UBC) is vital for the design of shallow foundations. Recently, soft computing methods (e.g., artificial neural networks and support vector machines) have been used for this purpose. In this paper, Random Forest (RF) is utilized as a tree-based ensemble classifier for predicting the UBC of shallow foundations on cohesionless soils. The inputs of model are wi...

Ensemble Learning Approach for Author Profiling

With the evolution of the internet, author profiling has become a topic of great interest in fields such as forensics, security, marketing, and plagiarism detection. However, the task of identifying the characteristics of an author based solely on a text document has its own limitations and challenges. This paper reports on the design, techniques and learning models we adopted for the PAN-2014 Author P...

Publication date: 2015